Detecting Speculative Language Using Syntactic Dependencies and Logistic Regression
نویسندگان
چکیده
In this paper we describe our approach to the CoNLL 2010 shared task on detecting speculative language in biomedical text. We treat the detection of sentences containing uncertain information (Task1) as a token classification task since the existence or absence of cues determines the sentence label. We distinguish words that have speculative and non-speculative meaning by employing syntactic features as a proxy for their semantic content. In order to identify the scope of each cue (Task2), we learn a classifier that predicts whether each token of a sentence belongs to the scope of a given cue. The features in the classifier are based on the syntactic dependency path between the cue and the token. In both tasks, we use a Bayesian logistic regression classifier incorporating a sparsity-enforcing Laplace prior. Overall, the performance achieved is 85.21% F-score and 44.11% F-score in Task1 and Task2, respectively.
منابع مشابه
Evaluating automatic annotation: automatically detecting and enriching instances of the dative alternation
In this article, we automatically create two large and richly annotated data sets for studying the English dative alternation. With an intrinsic and an extrinsic evaluation, we address the question of whether such data sets that are obtained and enriched automatically are suitable for linguistic research, even if they contain errors. The extrinsic evaluation consists of building logistic regres...
متن کاملDependency Parsing with Reference to Slovene, Spanish and Swedish
We describe a parser used in the CoNLL 2006 Shared Task, “Multingual Dependency Parsing.” The parser first identifies syntactic dependencies and then labels those dependencies using a maximum entropy classifier. We consider the impact of feature engineering and the choice of machine learning algorithm, with particular focus on Slovene, Spanish and Swedish.
متن کاملDetecting Corporate Financial Fraud using Beneish M-Score Model
Detecting financial fraud is an important issue and ignoring this issue may cause financial and non-financial losses to individuals and organizations. The aim of this study is to test the ability of Beneish M-Score Model for detecting financial fraud among companies listed on Tehran stock exchange. The research sample consists of 137 companies listed on Tehran Stock Exchange for a period of 11 ...
متن کاملCombining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL
Reading is known to be an essential task in language learning, but finding the appropriate text for every learner is far from easy. In this context, automatic procedures can support the teacher’s work. Some tools exist for English, but at present there are none for French as a foreign language (FFL). In this paper, we present an original approach to assessing the readability of FFL texts using ...
متن کاملVariability is the spice of learning, and a crucial ingredient for detecting and generalizing in nonadjacent dependencies
An important aspect of language acquisition involves learning the syntactic nonadjacent dependencies that hold between words in sentences, such as subject/verb agreement or tense marking in English. Despite successes in statistical learning of adjacent dependencies, the evidence is not conclusive for learning nonadjacent items. We provide evidence that discovering nonadjacent dependencies is po...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010